Stork data scheduler: mitigating the data bottleneck in e-Science.
Authors
Abstract
In this paper, we present the Stork data scheduler as a solution for mitigating the data bottleneck in e-Science and data-intensive scientific discovery. Stork focuses on the planning, scheduling, monitoring and management of data placement tasks, and on application-level, end-to-end optimization of networked input/output for petascale distributed e-Science applications. Unlike existing approaches, Stork treats data resources and the tasks related to data access and movement as first-class entities, just like computational resources and compute tasks, rather than as mere side-effects of computation. Stork provides unique features such as the aggregation of data transfer jobs based on their source and destination addresses, and an application-level throughput estimation and optimization service. We describe how these two features are implemented in Stork and their effects on end-to-end data transfer performance.
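The first feature mentioned in the abstract, aggregating data transfer jobs that share endpoints, can be pictured with a short sketch. The Python snippet below is only an illustration of the idea under assumed job fields ('src' and 'dst' URLs), not Stork's actual implementation or job syntax: queued transfer requests with the same source and destination hosts are merged into a single multi-file transfer, so connection setup and protocol overhead are paid once per endpoint pair.

```python
# Minimal sketch (not Stork's actual code) of aggregating data transfer jobs
# by their endpoints: jobs that share the same (source host, destination host)
# pair are merged into one multi-file transfer.
# The job structure and field names ('src', 'dst') are illustrative assumptions.
from collections import defaultdict
from urllib.parse import urlparse


def aggregate_transfer_jobs(jobs):
    """Group queued transfer jobs that share source and destination hosts."""
    groups = defaultdict(list)
    for job in jobs:
        src_host = urlparse(job["src"]).netloc
        dst_host = urlparse(job["dst"]).netloc
        groups[(src_host, dst_host)].append((job["src"], job["dst"]))
    return [
        {"src_host": s, "dst_host": d, "files": files}
        for (s, d), files in groups.items()
    ]


if __name__ == "__main__":
    queue = [
        {"src": "gsiftp://a.example.org/data/f1", "dst": "gsiftp://b.example.org/scratch/f1"},
        {"src": "gsiftp://a.example.org/data/f2", "dst": "gsiftp://b.example.org/scratch/f2"},
        {"src": "gsiftp://c.example.org/data/f3", "dst": "gsiftp://b.example.org/scratch/f3"},
    ]
    for agg in aggregate_transfer_jobs(queue):
        print(f"{agg['src_host']} -> {agg['dst_host']}: {len(agg['files'])} file(s)")
```

With this grouping, the two files going from a.example.org to b.example.org are carried by one aggregated job, while the transfer from c.example.org remains separate.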
Similar resources
Developing and Validating a New Wireless Wearable Device for Balance Measurement in Sport and Clinical Setting
One of the new clinical techniques for assessing lower-body parameters is the use of wearable ultrasonic sensors. A device that can measure static and dynamic balance abilities in sport and clinical settings, using the ultrasonic signals travelling between a transmitter and a receiver placed on the two feet, was developed and validated. The new device consisted of a pressure gauge and a digital centimeter indicator a...
Selected papers from the 2010 e-Science All Hands Meeting.
The annual e-Science All Hands Meeting (AHM) is the premier e-Science conference held regularly in the United Kingdom, and provides a forum for the e-Science community to present and demonstrate their research, exchange ideas and socialize. This Theme Issue, entitled 'e-Science: novel research, new science, and enduring impact', features selected papers from AHM 2010 with the aim of highlightin...
Run-time Adaptation of Grid Data Placement Jobs
The Grid presents a continuously changing environment and introduces a new set of failures. The data grid initiative has made it possible to run data-intensive applications on the grid. Data-intensive grid applications consist of two parts: a data placement part and a computation part (see the sketch after this list). The data placement part is responsible for transferring the input data to the compute node and the result of ...
Data Replication-Based Scheduling in Cloud Computing Environment
High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems such as the data grid, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...
Improving for Drum_Buffer_Rope material flow management with attention to second bottlenecks and free goods in a job shop environment
Drum–Buffer–Rope is a theory of constraints production planning methodology that operates by developing a schedule for the system’s first bottleneck. The first bottleneck is the bottleneck with the highest utilization. In the theory of constraints, any job that is not processed at the first bottleneck is referred to as a free good. Free goods do not use capacity at the first bottleneck, so very...
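The "Run-time Adaptation of Grid Data Placement Jobs" entry above splits a data-intensive grid application into a data placement part and a computation part. The sketch below is a hypothetical illustration of that split rather than anything taken from that paper: the stage-in, compute and stage-out steps run as explicitly ordered jobs, and treating the transfers as jobs of their own lets a scheduler retry them independently of the computation (the retry policy here is an assumption for illustration).

```python
# Hypothetical sketch of the data placement / computation split described in the
# "Run-time Adaptation of Grid Data Placement Jobs" entry above; the names and
# the naive retry policy are assumptions, not that paper's method.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Job:
    name: str
    kind: str                      # "data_placement" or "computation"
    action: Callable[[], None]     # what the job actually does


def run_pipeline(jobs: List[Job]) -> None:
    """Run jobs in order; retry a failed data placement job once."""
    for job in jobs:
        try:
            job.action()
        except Exception:
            if job.kind == "data_placement":
                job.action()       # transfers fail more often, so give them one retry
            else:
                raise


pipeline = [
    Job("stage_in",  "data_placement", lambda: print("transfer input data to compute node")),
    Job("compute",   "computation",    lambda: print("run the application")),
    Job("stage_out", "data_placement", lambda: print("transfer results back to storage")),
]
run_pipeline(pipeline)
```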
Journal: Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences
Volume 369, Issue 1949
Pages: -
Published: 2011